A self-updating road map of The Cancer Genome Atlas

Authors

  • David E. Robbins
  • Alexander Grüneberg
  • Helena F. Deus
  • Murat M. Tanik
  • Jonas S. Almeida
Abstract

MOTIVATION: Since 2011, The Cancer Genome Atlas' (TCGA) files have been accessible through HTTP from a public site, creating entirely new possibilities for cancer informatics by enhancing data discovery and retrieval. Significantly, these enhancements enable the reporting of analysis results that can be fully traced to and reproduced using their source data. However, to realize this possibility, a continually updated road map of files in the TCGA is required. Creation of such a road map represents a significant data modeling challenge, due to the size and fluidity of this resource: each of the 33 cancer types is instantiated in only partially overlapping sets of analytical platforms, while the number of data files available doubles approximately every 7 months.

RESULTS: We developed an engine to index and annotate the TCGA files, relying exclusively on third-generation web technologies (Web 3.0). Specifically, this engine uses JavaScript in conjunction with the World Wide Web Consortium's (W3C) Resource Description Framework (RDF), and SPARQL, the query language for RDF, to capture metadata of files in the TCGA open-access HTTP directory. The resulting index may be queried using SPARQL, and enables file-level provenance annotations as well as discovery of arbitrary subsets of files, based on their metadata, using web standard languages. In turn, these abilities enhance the reproducibility and distribution of novel results delivered as elements of a web-based computational ecosystem. The development of the TCGA Roadmap engine was found to provide specific clues about how biomedical big data initiatives should be exposed as public resources for exploratory analysis, data mining and reproducible research. These specific design elements align with the concept of knowledge reengineering and represent a sharp departure from top-down approaches in grid initiatives such as CaBIG. They also present a much more interoperable and reproducible alternative to the still pervasive use of data portals.

AVAILABILITY: A prepared dashboard, including links to source code and a SPARQL endpoint, is available at http://bit.ly/TCGARoadmap. A video tutorial is available at http://bit.ly/TCGARoadmapTutorial.

CONTACT: [email protected].
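As an illustration of the kind of metadata-based file discovery that the RESULTS paragraph describes, the following minimal sketch builds a SPARQL query selecting files by cancer type and platform, then encodes it as a SPARQL Protocol GET request. The vocabulary (`ex:diseaseStudy`, `ex:platform`, `ex:url`), the namespace, the example values and the endpoint address are all hypothetical placeholders, not the actual TCGA Roadmap schema or endpoint; the real predicates would need to be taken from the published dashboard.

```python
from urllib.parse import urlencode

# Hypothetical namespace: NOT the actual TCGA Roadmap vocabulary.
PREFIX = "http://example.org/tcga-roadmap#"


def sparql_literal(value: str) -> str:
    """Quote a Python string as a SPARQL string literal (escapes \\ and ")."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def build_file_query(disease: str, platform: str, limit: int = 10) -> str:
    """Return a SELECT query for files matching a disease and a platform.

    The predicate names are assumptions made for illustration only.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    return (
        f"PREFIX ex: <{PREFIX}>\n"
        "SELECT ?file ?url WHERE {\n"
        f"  ?file ex:diseaseStudy {sparql_literal(disease)} ;\n"
        f"        ex:platform {sparql_literal(platform)} ;\n"
        "        ex:url ?url .\n"
        "}\n"
        f"LIMIT {limit}"
    )


def build_request_url(endpoint: str, query: str) -> str:
    """Encode the query in the 'query' parameter of a SPARQL Protocol GET URL."""
    return endpoint + "?" + urlencode({"query": query})


# Placeholder values and endpoint; no network request is made here.
query = build_file_query("gbm", "humanmethylation450")
print(query)
print(build_request_url("https://example.org/sparql", query))
```

Because the query is plain text sent over HTTP, the same request could be issued from a browser, a script or a notebook, which is the kind of web-standard interoperability the abstract emphasizes.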

Similar resources

Research on Updating of Urban Large Scale Road Map Based on High Resolution Remote Sensing Image

Roads are among the most important features of an urban area, and changes in the road network reflect the pace of urban construction. Updating road maps promptly and accurately has therefore become an urgent issue. Current methods for updating road features are carried out entirely manually, and have the disadvantages of tending to miss changed roads and offering a low level of automation. The updating of map road features includes two parts: one ...

Classification of Streaming Fuzzy DEA Using Self-Organizing Map

The classification of fuzzy data is considered one of the most challenging areas of data analysis, and the complexity of the procedures has been an obstacle to the development of new methods for fuzzy data analysis. However, there have been significant advances in modeling systems in which fuzzy data are available in the field of mathematical programming. In order to exploit the results of the research on ...

Spatio-temporal Modeling in Road Network Change Detection and Updating

Automatic road map updating has been one of the difficult and important research topics in the community of geomatics. An operational road map updating system should include three key components: the generation of a new version of roads, the automatic road change detection/updating and the spatio-temporal modelling of road data. Special considerations have to be given to the spatio-temporal mod...

A Framework for Road Change Detection and Map Updating

The updating of road network databases is crucial to many Geographic Information System (GIS) applications such as navigation, urban planning, etc. This paper presents a comprehensive framework for image-based road network updating, in which the following three tasks are performed sequentially: road extraction from imagery, road change detection and updating, and spatio-temporal modeling. For r...

Road Data Updating Using Tools of Matching and Map Generalization

Map generalization is one of the important approaches to GIS data updating. This paper analyzes the main steps of road data updating based on map generalization. As the core of this updating process, a matching method that considers level analyses and a selective omission method based on mesh density are developed. The approach for road data updating based on these two tools is proposed, which i...

Journal title:

Volume 29  Issue 

Pages  -

Publication date: 2013